Incremental Discretization for Naïve-Bayes Classifier
Abstract
Naïve-Bayes classifiers (NB) support incremental learning. However, the lack of effective incremental discretization methods has hindered NB’s incremental learning in the face of quantitative data. The problem is compounded by the fact that quantitative data are everywhere, from temperature readings to share prices. In this paper, we present a novel incremental discretization method for NB, incremental flexible frequency discretization (IFFD). IFFD discretizes the values of a quantitative attribute into a sequence of intervals of flexible sizes, and it allows online insertion and splitting operations on those intervals. Theoretical analysis and experimental tests are conducted to compare IFFD with alternative methods. Empirical evidence suggests that IFFD is efficient and effective. NB coupled with IFFD achieves a balance between high learning efficiency and high classification accuracy in the context of incremental learning.
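To make the insertion and splitting operations concrete, here is a minimal Python sketch of an incremental discretizer with flexible interval sizes. The abstract does not give the size bounds or the split rule, so the class name `FlexibleIntervals`, the parameters `min_size` and `max_size`, and the near-midpoint split are all assumptions made for illustration. This is not the authors' implementation.

```python
from bisect import bisect_left, insort


class FlexibleIntervals:
    """Hypothetical sketch: each interval holds a flexible number of values."""

    def __init__(self, min_size=2, max_size=4):
        # min_size and max_size are assumed bounds, not taken from the abstract.
        self.min_size = min_size
        self.max_size = max_size
        self.cuts = []    # sorted cut points; interval i covers (cuts[i-1], cuts[i]]
        self.bins = [[]]  # sorted values stored per interval

    def index_of(self, value):
        """Return the index of the interval that covers `value`."""
        return bisect_left(self.cuts, value)

    def insert(self, value):
        """Insert one value online, splitting its interval if it grows too large."""
        i = self.index_of(value)
        insort(self.bins[i], value)
        if len(self.bins[i]) > self.max_size:
            self._split(i)

    def _split(self, i):
        values = self.bins[i]
        n = len(values)
        # A split position k is valid if both halves keep at least min_size
        # values and the boundary does not fall between equal values.
        candidates = [
            k for k in range(self.min_size, n - self.min_size + 1)
            if values[k - 1] < values[k]
        ]
        if not candidates:
            return  # no valid split; the interval stays oversized
        k = min(candidates, key=lambda c: abs(c - n / 2))
        self.cuts.insert(i, values[k - 1])  # left half covers values <= new cut
        self.bins[i:i + 1] = [values[:k], values[k:]]


d = FlexibleIntervals(min_size=2, max_size=4)
for v in [1, 2, 3, 4, 5]:
    d.insert(v)
print(d.cuts, d.bins)
```

Only the interval that receives the new value is touched, so earlier intervals keep their boundaries. In a full naïve-Bayes learner, the per-interval class counts would be updated in the same step; that bookkeeping is omitted here.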
Similar resources
Improvement of Decision Accuracy Using Discretization of Continuous Attributes
The naïve Bayes classifier has been widely applied to decision making and classification. Because the naïve Bayes classifier prefers discrete values, a novel discretization approach is proposed in this paper to improve the classifier and enhance decision accuracy. Based on the statistical information of the naïve Bayes classifier, a distributional index is defined in the ...
Effective Discretization and Hybrid feature selection using Naïve Bayesian classifier for Medical datamining
As a probability-based statistical classification method, the Naïve Bayesian classifier has gained wide popularity despite its assumption that attributes are conditionally mutually independent given the class label. Improving the predictive accuracy and achieving dimensionality reduction for statistical classifiers have been active research areas in data mining. Our experimental results suggest...
An examination of the effect of discretization on a naïve Bayes model's performance
A Bayesian network (or belief network) is a probabilistic graphical model that represents a set of variables and their probabilistic independencies. Research often involves continuous random variables. To use these continuous variables in BN models, they must be converted into discrete variables with a limited number of states, often two. During the discretization process, one pr...
A Kernel-Based Semi-Naïve Bayesian Classifier Using P-Trees
A novel semi-naive Bayesian classifier is introduced that is particularly suitable for data with many attributes. The naive Bayesian classifier is taken as a starting point, and correlations are reduced by joining highly correlated attributes. Our technique differs from related work in its use of kernel functions that systematically include continuous attributes rather than relying on dis...
A greedy algorithm for supervised discretization
We present a greedy algorithm for supervised discretization using a metric defined on the space of partitions of a set of objects. The proposed technique is useful for preparing data for classifiers that require nominal attributes. Experimental work on decision trees and naïve Bayes classifiers confirms the efficacy of the proposed algorithm.